Demonstration of the CROSSMARC System

Authors

  • Vangelis Karkaletsis
  • Constantine D. Spyropoulos
  • Dimitris Souflis
  • Claire Grover
  • Ben Hachey
  • Maria Teresa Pazienza
  • Michele Vindigni
  • Emmanuel Cartier
  • José Coch
Affiliations

  • Institute for Informatics and Telecommunications, NCSR “Demokritos”: {vangelis, costass}@iit.demokritos.gr
  • Velti S.A.: [email protected]
  • Division of Informatics, University of Edinburgh: {grover, bhachey}@ed.ac.uk
  • D.I.S.P., Università di Roma Tor Vergata: {pazienza, vindigni}@info.uniroma2.it
  • Lingway: {emmanuel.cartier, Jose.Coch}@lingway.com

Similar resources

Information Retrieval and Extraction from the Web: the CROSSMARC approach

The paper presents the CROSSMARC approach for the complex task of identification of interesting web sites and web pages and the extraction of information from them. This task is hard because most of the information on the Web today is in the form of HTML documents, which are designed for presentation purposes and not for automatic extraction systems. This task becomes even harder in a multiling...

Named Entity Recognition in Greek Web Pages

We describe the functionalities of the Hellenic Named Entity Recognition and Classification (HNERC) system developed in the context of the CROSSMARC project. CROSSMARC is developing technology for e-retail product comparison. The CROSSMARC system locates relevant retailers’ web pages and processes them in order to extract information about their products (e.g. technical features, prices). CROSS...

Domain-Specific Web Site Identification: The CROSSMARC Focused Web Crawler

This paper presents techniques for identifying domain-specific web sites that have been implemented as part of the EC-funded R&D project CROSSMARC. The project aims to develop technology for extracting interesting information from domain-specific web pages. It is therefore important for CROSSMARC to identify web sites in which interesting domain-specific pages reside (focused web crawling). Th...

Use of Ontologies for Cross-lingual Information Management in the Web

We present the ontology-based approach for cross-lingual information management of web content that has been developed by the EC-funded project CROSSMARC. CROSSMARC can be perceived as a meta-search engine, which identifies domain-specific information from the Web. To achieve this, it employs agents for web crawling, spidering, information extraction from web pages, data storage, and data present...

Cross-lingual Information Extraction from Web pages: the use of a general-purpose Text Engineering Platform

In this paper we present how the use of a general-purpose text engineering platform has facilitated the development of a cross-lingual information extraction system and its adaptation to new domains and languages. Our approach for cross-lingual information extraction from the Web covers all the way from the identification of Web sites of interest, to the location of the domain-specific Web pages,...

Publication date: 2003